DEEP RECURRENT NEURAL NETWORK BASED AUDIO SPEECH RECOGNITION SYSTEM
نویسندگان
چکیده
منابع مشابه
Audio Visual Speech Recognition Using Deep Recurrent Neural Networks
In this work, we propose a training algorithm for an audiovisual automatic speech recognition (AV-ASR) system using deep recurrent neural network (RNN).First, we train a deep RNN acoustic model with a Connectionist Temporal Classification (CTC) objective function. The frame labels obtained from the acoustic model are then used to perform a non-linear dimensionality reduction of the visual featu...
متن کاملDeep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition
A deep learning approach has been widely applied in sequence modeling problems. In terms of automatic speech recognition (ASR), its performance has significantly been improved by increasing large speech corpus and deeper neural network. Especially, recurrent neural network and deep convolutional neural network have been applied in ASR successfully. Given the arising problem of training speed, w...
متن کاملStochastic Recurrent Neural Network for Speech Recognition
This paper presents a new stochastic learning approach to construct a latent variable model for recurrent neural network (RNN) based speech recognition. A hybrid generative and discriminative stochastic network is implemented to build a deep classification model. In the implementation, we conduct stochastic modeling for hidden states of recurrent neural network based on the variational auto-enc...
متن کاملStimulated Deep Neural Network for Speech Recognition
Deep neural networks (DNNs) and deep learning approaches yield state-of-the-art performance in a range of tasks, including speech recognition. However, the parameters of the network are hard to analyze, making network regularization and robust adaptation challenging. Stimulated training has recently been proposed to address this problem by encouraging the node activation outputs in regions of t...
متن کاملSpeeding up deep neural network based speech recognition systems
Recently, deep neural network (DNN) based acoustic modeling has been successfully applied to large vocabulary continuous speech recognition (LVCSR) tasks. A relative word error reduction around 20% can be achieved compared to a state-of-the-art discriminatively trained Gaussian Mixture Model (GMM). However, due to the huge number of parameters in the DNN, real-time decoding is a bottleneck for ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: INFORMATION TECHNOLOGY IN INDUSTRY
سال: 2021
ISSN: 2203-1731
DOI: 10.17762/itii.v9i2.434